Reliability and Maintainability Analysis of a High Air Pressure Compressor Facility

Fayssal M. Safie, Ph.D. • NASA Marshall Space Flight Center • Huntsville, AL 

Robert W. Ring • Bastion Technologies, Inc. • NASA Marshall Space Flight Center • Huntsville, AL 

Stuart K. Cole, Ph.D., P.E. • NASA Langley Research Center • Hampton, VA 

Key Words: Compressor Station, Reliability, Availability, Maintainability, Cost Analysis 


SUMMARY & CONCLUSIONS 

This paper discusses a Reliability, Availability, and 
Maintainability (RAM) independent assessment conducted 
to support the refurbishment of the Compressor Station at 
the NASA Langley Research Center (LaRC). The paper 
discusses the methodologies used by the assessment team to 
derive the repair-by-replacement (RR) strategies to improve 
the reliability and availability of the Compressor Station 
(Ref. 1). This includes a Raptor simulation model that 
was used to generate the statistical data analysis needed to 
derive a 15-year investment plan to support the 
refurbishment of the facility. To summarize, study results 
clearly indicate that the air compressors are well past their 
design life. The major compressor failures indicate that 
significant latent failure causes are present. Given the 
occurrence of these high-cost failures following compressor 
overhauls, future major failures should be anticipated if 
compressors are not replaced. Given the results from the 
RR analysis, the study team recommended a compressor 
replacement strategy. Based on the data analysis, the RR 
strategy will lead to sustainable operations through 
significant improvements in reliability, availability, and the 
probability of meeting the air demand with acceptable 
investment cost that should translate, in the long run, into 
major cost savings. For example, the probability of meeting 
air demand improved from 79.7 percent for the Base Case 
to 97.3 percent. Expressed in terms of a reduction in the 
probability of failing to meet demand (1 in 5 days to 1 in 37 
days), the improvement is about 700 percent. Similarly, 
compressor replacement improved the operational 
availability of the facility from 97.5 percent to 99.8 percent. 
Expressed in terms of a reduction in system unavailability 
(1 in 40 to 1 in 500), the improvement is better than 1000 
percent (an order of magnitude improvement). 

It is worth noting that the methodologies, tools, and 
techniques used in the LaRC study can be used to evaluate 
similar high-value equipment components and facilities. 
Also, lessons learned in data collection and maintenance 
practices derived from the observations, findings, and 
recommendations of the study are extremely important in 
the evaluation and sustainment of new compressor 
facilities. 


1. BACKGROUND 

The Langley Research Center’s (LaRC) High-Pressure Air 
Compressor Station provides high-pressure compressed air 
at relatively high daily volumes for use at approximately 25 
research facilities around LaRC. The Compressor Station 
has been in continuous operation for over 60 years. Three 
of its six compressors currently in service have been 
operating since the early 1950s. Despite efforts to upgrade 
and refurbish the compressors, the Station continues to be 
challenged by frequent downing events, obsolete 
equipment, and aging infrastructure. Consequently, LaRC 
management requested that NASA’s Safety Center conduct an 
independent reliability and availability assessment of the 
Compressor Station and make recommendations for 
ensuring long-term sustainment of operations. 

1.1 Assessment Tasks 

The independent assessment was structured into 
three subtasks as follows: 

Subtask 1: Assess System Reliability and Availability 

• Review previous problems and failures, develop a 
failure database to quantify Station availability and 
make recommendations concerning data collection 
and trending. 

• Quantify component availability as compared to 
new equipment. 

Subtask 2: Assess New Equipment Alternatives 

• Assess current state-of-the-art industrial systems 
available for a repair-by-replacement strategy for 
all major systems. 

Subtask 3: Make Recommendations on Specific Questions 

• Is it economically prudent to continue on the path 
of refurbishment and upgrading of the current suite 
of compressors, dryers, valves and ancillary 
systems or is a repair-by-replacement a better 
option? 



• How should the facility ensure a given daily output 
capacity: with multiple machines or a single 
dependable machine with rapid access to repair 
parts? 

• Which system(s) should receive the most attention 
to improve reliability, especially if a repair-by- 
replacement posture is taken: compressors, valves, 
maintenance/spares, operations, other? 

• Would new compressors be more dependable than 
the existing ones considering the Compressor 
Station’s operational situation (i.e., starting and 
stopping compressors every day)? 

• What should the Compressor Station look like in 
10 years? How do we get there? 

1.2 The Compressor Facility 

High-pressure air is provided by six 6-stage 
compressors (Ref. 2). Three of these machines are 
Worthington BDC (Dresser-Rand) reciprocating compressors 
that deliver 8 lbs/sec at 6,000 psi, each with a 4,000 hp 
synchronous motor. The three smaller Clark CRA 
reciprocating compressors deliver 2.5 lbs/sec at 5,000 psi, 
each with a 1,250 hp synchronous motor. There are five 
high-pressure drying systems using activated 
alumina desiccant. Two of the small compressors share one 
dryer; each of the other compressors is connected to its own 
dryer. Air is delivered through piping in an underground 
tunnel system from a series of storage bottle fields to the 
research facilities. The bottle field storage capacity is 
36,000 cu-ft at 6,000 psi and 27,000 cu-ft at 5,000 psi. 

Air Operations are planned and managed through 
weekly meetings at the Compressor Station by updating 
rolling three-week projections, based on inputs from the 
research facilities. The histogram of daily requests is 
illustrated in Figure 1. 

Figure 1 Daily Air Requests Histogram 

The practice at the Compressor Station is to run 
available compressor(s) for a minimum of two to four hours 
per day. The Hypersonic Airbreathing Propulsion 8-Foot 
High-Temperature Tunnel, the world’s largest high-pressure 
wind tunnel, requires 6,000 psi minimum air pressure for 
its research tests. Some research facilities require 
pressures in the 3,000 to 4,000 psi range at low volume, 
while others use high-pressure air in the 3,000 to 5,000 psi 
range (Ref. 3). 

The interrelationships of the various Compressor 
Station systems are illustrated in Figure 2. The major units 
consist of Compressors, Dryers, Cooling Towers, Oil 
Skimmer, the Vent System, and the Bottle Field storage 
system. Air from Compressors #1 and #2 is dried through 
Dryer #1. The capacity of Dryer #1 is limited to the output 
of one compressor at a time. If one compressor is 
operating, the other would be in standby mode or down for 
maintenance or repair. 

The Station’s normal schedule is based on two 
8-hour shifts per day (M-F). However, due to frequent 
equipment outages, the operators often must work a third 
shift and/or weekends in order to satisfy daily demand. 

Figure 2 Compressor Station Diagram 


2. RELIABILITY AND AVAILABILITY EVALUATION 

2.1 The Raptor Tool 

The Raptor tool, which was used in this study, is a 
Windows-based simulation program offered by ARINC 
Corporation that provides modeling capabilities for 
reliability, cost, and capacity trade-off studies. Raptor’s 
graphical interface uses a drag-and-drop reliability block 
diagram (RBD) format to simulate system operations, with 
an emphasis on availability. 


2.2 Reliability Analysis 

The reliability assessment comprised three steps: 1) 
development of a reliability database, which was used to 
estimate MTBF of compressors, dryers, and cooling towers, 
2) development of the system-level reliability simulation 
using the Raptor tool, and 3) evaluation of the system 
reliability for each of the sustainment options. 

Compressor Station maintenance records were 
screened and failures were assigned to the applicable 
systems. Exposure times were estimated from compressor 
hour-meter readings in the Facility Maintenance Log 
database. For repairable components, assuming an 
exponential distribution for time between failures, given N 
failures and total exposure time T, the Maximum 
Likelihood Estimator of the MTBF is given simply as T/N. 
These concepts were used to estimate the failure rate/hour 
and the MTBF (equal to the reciprocal of the mean failure 
rate) of the various subsystems. Table 1 presents exposure 
times for the compressors and associated dryers and cooling 
towers. 
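The T/N estimate described above can be sketched in a few lines of Python. This is only an illustration of the arithmetic: the exposure time is the Compressor #4 value from Table 1, but the failure count is a hypothetical placeholder, since the failure counts themselves are not reported here. The function name is likewise an assumption.

```python
def mtbf_mle(exposure_hours: float, failures: int) -> float:
    """Maximum likelihood estimate of MTBF: exposure time T over N failures."""
    if failures <= 0:
        raise ValueError("need at least one failure for a point estimate")
    return exposure_hours / failures


# Exposure time for Compressor #4 is taken from Table 1.
# The failure count is HYPOTHETICAL, chosen only to illustrate the arithmetic.
exposure_hours = 8941.0
hypothetical_failures = 10

mtbf_hours = mtbf_mle(exposure_hours, hypothetical_failures)
failure_rate_per_hour = 1.0 / mtbf_hours  # reciprocal of the MTBF

print(f"MTBF = {mtbf_hours:.1f} h")
print(f"failure rate = {failure_rate_per_hour:.6f} per hour")
```

The estimate is only as good as the exponential assumption and the screening of the maintenance records that produced T and N.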


Table 1 Exposure Times 

SUBSYSTEM            Hours 
Compressor #1        4,704 
Compressor #2          885 
Compressor #3        1,061 
Compressor #4        8,941 
Compressor #5        7,692 
Compressor #6        6,717 
Cooling Tower #1    13,367 
Cooling Tower #2    16,633 
Dryer #1             5,589 
Dryer #2             1,061 
Dryer #3             8,941 
Dryer #4             7,692 
Dryer #5             6,717 


The compressor downtime distribution is represented as the 
sum of three distributions: Pre-repair Logistic Downtime, 
Repair Time, and Post-repair Logistic Delay. These 
distributions were developed from elicitations with the 
Compressor Station Facility Process Engineer and summed 
using Monte Carlo simulation to produce a distribution for 
total downtime as illustrated in Figure 3. The mean 
downtime was estimated to be 56.4 hours with a standard 
deviation of 36 hours. 
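A minimal sketch of summing three downtime components by Monte Carlo simulation follows. The triangular shapes and every parameter value are hypothetical placeholders, not the elicited distributions used in the study, so the resulting mean and standard deviation are not expected to reproduce the 56.4-hour and 36-hour figures.

```python
import random
import statistics

# HYPOTHETICAL triangular parameters (low, high, mode) in hours.
# The study elicited its own distributions; these values are placeholders.
COMPONENTS = {
    "pre_repair_logistic": (2.0, 60.0, 12.0),
    "repair": (4.0, 72.0, 20.0),
    "post_repair_logistic": (1.0, 12.0, 3.0),
}


def simulate_total_downtime(n_trials: int, seed: int = 1) -> list[float]:
    """Draw each component independently and sum them, once per trial."""
    rng = random.Random(seed)
    totals = []
    for _ in range(n_trials):
        total = sum(
            rng.triangular(low, high, mode)
            for low, high, mode in COMPONENTS.values()
        )
        totals.append(total)
    return totals


totals = simulate_total_downtime(20_000)
mean_downtime = statistics.fmean(totals)
sd_downtime = statistics.stdev(totals)
print(f"mean = {mean_downtime:.1f} h, standard deviation = {sd_downtime:.1f} h")
```

Sampling the components independently is itself an assumption; correlated delays would widen the total downtime distribution.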



MTBF estimates for new replacement equipment were 
derived from the following sources: 

• Dresser-Rand Corporation (Ref. 4) 

• Sloan Brothers Co. (Watchman lubrication system) 

• Naval Surface Warfare Center (NSWC) reliability 
calculator provided by ALD, Inc. 

• Nuclear Regulatory Commission/Institute of Nuclear 
Power Operations EPIX (an equipment failure 
database), NUREG/CR-6928, NUREG/CR-7037, and 
NUREG/CR-5419 (valve and dryer failure rates) 

2.3 Availability Analysis 

Performance statistics were calculated using simulation 
data from a minimum of 12,000 hours of continuous 
operation. This represented at least three years assuming 
the plant operates 16 hours per day, Monday through 
Friday, excluding holidays. The following system metrics 
were used to assess and compare the various alternatives: 

• Operational Availability (Ao) = Uptime/(Uptime + 
Downtime) 

• The probability of meeting daily demand working two 
shifts (no overtime) 

• The probability of meeting daily demand if a third shift 
(overtime) is allowed 

• The percentage of days that a third shift is required 

• Daily air capacity relative to daily air requested (excess 
or shortage) 

• Total cost (Present Worth and Annual Worth) in FY11 
constant worth dollars 

  o O&M 
  o PP&E 

The analysis used Compressor Station daily air request 
data from October 4, 2008 through April 29, 2011, 
excluding weekends and holidays. This air request data 
was plotted as an empirical cumulative probability 
distribution in Figure 4 in order to highlight the percentiles 
of the distribution. The 90th percentile is 870 klbs of air. In 
other words, the amount of air requested on any given day 
has a 90 percent probability of being less than or equal to 
870 klbs. 

The Compressor Station’s goal is to meet or exceed 
requests at least 90 percent of the time. Hence, the demand 
curve provides a yardstick against which to measure the 
probability of meeting their stated goal. 
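The way an empirical demand distribution serves as a yardstick can be sketched as follows. The request values below are hypothetical and are not the LaRC request data; the nearest-rank percentile rule and the function names are also assumptions made for illustration.

```python
def empirical_cdf(requests, x):
    """Fraction of days on which the requested amount was <= x."""
    return sum(1 for r in requests if r <= x) / len(requests)


def percentile_nearest_rank(requests, percent: int):
    """Smallest observed request that covers `percent` percent of days."""
    ordered = sorted(requests)
    rank = -(-percent * len(ordered) // 100)  # integer ceiling
    return ordered[max(rank, 1) - 1]


# HYPOTHETICAL daily air requests in klbs (not the study data).
daily_requests_klbs = [100, 200, 300, 400, 500, 600, 700, 800, 900, 1000]

p90 = percentile_nearest_rank(daily_requests_klbs, 90)

# If the plant could deliver at most 870 klbs per day, the share of days
# whose request it could cover is the empirical CDF evaluated at 870.
share_covered = empirical_cdf(daily_requests_klbs, 870)

print(f"90th percentile request = {p90} klbs")
print(f"share of days covered at 870 klbs = {share_covered:.0%}")
```

A daily capacity at or above the 90th percentile request is what the 90 percent goal requires, before any loss of capacity from downing events is considered.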

The maximum air capacity of the system is 
degraded whenever critical components become 
unavailable due to scheduled or unscheduled downing 
events. Management can decide to schedule a third shift 
whenever the day’s air production falls short of the 
requested amount of air. This operational flexibility needed 
to be taken into account in the analysis. 

The availability and reliability of the compressors 
and other ancillary equipment, such as dryers and valves, 
were evaluated using the Raptor tool, but Raptor was unable 
to factor in operational decisions based on the level of air 
output. To compensate, an Excel spreadsheet tool was 
developed to process Raptor’s detailed event file and 
schedule overtime when necessary. The event file details 
system operations versus time. Whenever a system event 
occurred in Raptor (such as when a compressor failed, a 
dryer failed, or a compressor came back up), Raptor 
recorded the event and the time of its occurrence in the 
event file. The Excel post processor used this detailed 
event information to create a daily summary of the total 
amount of air produced in 8-hour shifts. 

The post processor decided whether two shifts or 
three needed to be worked. After two shifts of Raptor data 
were processed, the air produced was compared to the 
amount requested. If the day’s demand was met or 
exceeded, the processor began to process the next string of 
data as a new work day. If demand was not met after two 
shifts, the tool continued to process a third 8-hour shift. 
After the end of the third shift, either the day’s demand was 
met or it was not. Failure to meet demand was counted as a 
failure. 
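The two-shift/three-shift decision rule can be sketched in Python as shown below. This is not the Excel post processor itself: the per-shift air totals are assumed to have already been extracted from the event file, and all names and numbers are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class DayResult:
    shifts_worked: int
    air_produced: float
    demand_met: bool


def process_days(shift_outputs, demands):
    """Consume per-shift air totals in order, two or three shifts per day."""
    results = []
    i = 0
    for demand in demands:
        if i + 2 > len(shift_outputs):
            break  # not enough simulated shifts left for another work day
        produced = shift_outputs[i] + shift_outputs[i + 1]
        i += 2
        shifts = 2
        # Schedule a third (overtime) shift only when two shifts fall short.
        if produced < demand and i < len(shift_outputs):
            produced += shift_outputs[i]
            i += 1
            shifts = 3
        results.append(DayResult(shifts, produced, produced >= demand))
    return results


# HYPOTHETICAL air produced per 8-hour shift and daily requests, in klbs.
shift_outputs = [400, 400, 300, 300, 300, 200, 200, 200]
demands = [700, 800, 900]

days = process_days(shift_outputs, demands)
n = len(days)
p_two_shift = sum(d.demand_met and d.shifts_worked == 2 for d in days) / n
p_with_overtime = sum(d.demand_met for d in days) / n
pct_third_shift = sum(d.shifts_worked == 3 for d in days) / n
print(p_two_shift, p_with_overtime, pct_third_shift)
```

The three ratios correspond to the two-shift probability of meeting demand, the probability of meeting demand when overtime is allowed, and the percentage of days requiring a third shift.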



Figure 4 Daily Air Requests Distribution 


3. THE REPLACEMENT PLAN 

The study team recognized the need to evaluate 
compressor replacement strategies rather than specific 
replacement candidates. The notion of a strategy 
emphasizes the long-term nature of any feasible 
sustainment plan. The Base Case for evaluation, against 
which the other alternatives were compared, is a variant of 
the refurbishment strategy that has been in place since 
2007. 

The replacement strategy included four equipment 
replacement Phases as described below. Following the 
Base Case, which begins at the end of FY11, subsequent 
Phases are spaced at 3-year intervals beginning in FY14 
and extending through FY26. Each new Phase represents 
the opportunity to replace an aging compressor with a new 
compressor. The Phases build on one another as each new 
compressor is brought on line and an old compressor is 
salvaged. The recommended order in which compressors 
are replaced is based on a number of factors including age, 
condition, and capacity. Reliability, availability, and cost 
metrics are presented by Phase over the entire planning 
horizon, which extends through 2026. For example, the 
Base Case is one in which Phases 1 through 4 are not 
exercised. Consequently, it extends throughout the 
planning horizon under the assumptions defined above, 
without replacing any of the existing compressors. 

Phase 1 begins with the Base Case and then in FY14 
replaces the three small compressors with one new large (8 
lbs/sec) compressor. It also requires one new dryer to 
replace Dryers #1 and #2. 

Phase 2 begins with Phase 1 and assumes Compressor #6 is 
replaced in FY17 with a new large compressor of the same 
capacity. Phase 2 does not require replacement of Dryer 
#2. 

Phase 3 begins with Phase 2 and assumes Compressor #4 
or #5, whichever one is selected for salvage at that time, is 
replaced in FY20. Phase 3 does not require a replacement 
of the dryer. 

Phase 4 begins with Phase 3 and assumes the remaining old 
compressor (#4 or #5) is replaced in FY23. Phase 4 does 
not require replacement of its dryer. As an alternative to 
Phase 4, salvage the remaining old compressor and do not 
replace it. 
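As a rough cross-check on the phasing, the nameplate flow rate of the fleet can be tallied after each Phase. The sketch below assumes that the ratings in Section 1.2 are per compressor, that Compressors #1 through #3 are the small units (inferred from the dryer pairing, not stated outright), and that #4 is the unit salvaged in Phase 3. It ignores the shared-dryer limit, pressure class, and all downtime, so it is not a substitute for the Raptor model.

```python
LARGE_LBS_PER_SEC = 8.0
SMALL_LBS_PER_SEC = 2.5

# ASSUMPTION: #1-#3 are the small compressors, #4-#6 the large ones.
base_case = {
    "#1": SMALL_LBS_PER_SEC,
    "#2": SMALL_LBS_PER_SEC,
    "#3": SMALL_LBS_PER_SEC,
    "#4": LARGE_LBS_PER_SEC,
    "#5": LARGE_LBS_PER_SEC,
    "#6": LARGE_LBS_PER_SEC,
}

# (phase, fiscal year, compressors salvaged). Phase 3 salvages #4 or #5;
# choosing #4 here is arbitrary.
PHASES = [
    ("Phase 1", "FY14", ["#1", "#2", "#3"]),
    ("Phase 2", "FY17", ["#6"]),
    ("Phase 3", "FY20", ["#4"]),
    ("Phase 4", "FY23", ["#5"]),
]


def nameplate_capacity_by_phase(base, phases):
    """Total nameplate flow after each phase adds one new large unit."""
    fleet = dict(base)
    rows = [("Base Case", "FY11", sum(fleet.values()))]
    for name, year, salvaged in phases:
        for unit in salvaged:
            del fleet[unit]
        fleet[f"New ({name})"] = LARGE_LBS_PER_SEC
        rows.append((name, year, sum(fleet.values())))
    return rows


capacity_rows = nameplate_capacity_by_phase(base_case, PHASES)
for name, year, capacity in capacity_rows:
    print(f"{name:9s} {year}: {capacity:.1f} lbs/sec")
```

Under these assumptions the nameplate total barely changes across Phases, which suggests that the gains reported in the study come mainly from reliability and availability rather than from added nameplate capacity.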


4. SUMMARY OF RESULTS 

The RR analysis shows that investing in new 
compressors can significantly increase the system 
availability, reliability, and capacity, which in combination 
increase the probability of meeting air demand. The 
two-shift probability of meeting air demand improved from 79.7 
percent for the Base Case to 97.3 percent for Option 4 as 
shown in Figure 5. 


Figure 7 Present Worth Total Cost of Alternatives 


Expressed in terms of a reduction in the probability of 
failing to meet demand (1 in 5 days to 1 in 37 days), the 
improvement is about 700 percent. Similarly, compressor 
replacement improved the operational availability of the 
facility from 97.5 percent for the Base Case to 99.8 percent 
for Option 4. Expressed in terms of a reduction in system 
unavailability (1 in 40 to 1 in 500 days), the improvement is 
better than 1000 percent (an order of magnitude). The total 
cost of investments in Plant Property and Equipment 
(PP&E) in constant worth FY11 dollars to achieve this 
improvement would be about $12M. This would be offset 
by a reduction in total cost for operation and maintenance 
(O&M) of $4.3M, resulting in a net increase of $7.7M over 
15 years as shown in Figure 7. However, since the analysis 
did not account for the escalating O&M cost due to 
increasing failure rates over time for the existing 
compressors, and did not credit the reduction in O&M cost 
with new compressors due to shorter downtimes, the net 
increase in total cost due to compressor replacement should 
be offset by cost savings in the long run. 
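The "1 in N" figures and improvement factors quoted above follow directly from the reported percentages, as the short calculation below shows. Only the percentages and dollar amounts come from the study; the variable names are illustrative.

```python
def one_in(probability: float) -> float:
    """Express a probability as '1 in N' by returning N."""
    return 1.0 / probability


# Two-shift probability of meeting demand: Base Case vs. full replacement.
base_fail = 1.0 - 0.797
new_fail = 1.0 - 0.973
fail_ratio = base_fail / new_fail

# Operational availability: Base Case vs. full replacement.
base_unavail = 1.0 - 0.975
new_unavail = 1.0 - 0.998
unavail_ratio = base_unavail / new_unavail

# Cost in FY11 constant dollars ($M): PP&E investment less O&M reduction.
net_cost_increase = 12.0 - 4.3

print(f"fail to meet demand: 1 in {one_in(base_fail):.1f} -> 1 in {one_in(new_fail):.1f}")
print(f"reduction factor: {fail_ratio:.1f}")
print(f"unavailable: 1 in {one_in(base_unavail):.0f} -> 1 in {one_in(new_unavail):.0f}")
print(f"reduction factor: {unavail_ratio:.1f}")
print(f"net cost increase: ${net_cost_increase:.1f}M")
```

The factors of roughly 7.5 and 12.5 are what the text reports as "about 700 percent" and "better than 1000 percent".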


• Use the provided Replacement plan as the basis for the 
refurbishment of the compressor facility. 

• Ensure that the catastrophic failure modes and hazards 
of compressor facility equipment are well understood 
and that proper controls are in place (foundations, 
piping, pressure bottles, etc.). 


5. FINDINGS 

• The ages of the LaRC compressors are well in 
excess of 30 years with most having been built in 
the 1950-1990 timeframe. 

• Reactive (corrective) maintenance, as was 
practiced by the LaRC compressor station, resulted 
in higher maintenance costs and an increase in 
unscheduled downing events. 

• The primarily reactive maintenance approach 
(fail-fix) of the existing compressor facility runs a 
greater risk of safety-related catastrophic events. 

• Personnel safety is threatened due to the lack of 
knowledge of the current status of pressure 
components (e.g., valves, piping, and 
pressure-containing tanks). 

• Failure data collected by the Compressor Station is 
not currently in a form that can be used to predict 
equipment reliability without extensive data 
conditioning. 

• Heat exchangers within the compressors, primarily 
the intercoolers and aftercoolers, and other heat 
exchangers in the system are subject to corrosion. 
These components will exceed their 20-year life 
within the planning horizon of this study and will 
need to be replaced. 


6. RECOMMENDATIONS 

Replacement Strategy 


• Work with the facility maintenance contractor to 
develop better maintenance data practices and data- 
gathering information for systems and components. 

• Fund, implement, and execute a reliability-centered 
maintenance (RCM) program, which includes 
preventive, predictive, and corrective maintenance 
activities. Implementing an RCM program will help 
identify critical parts and life-limited components, as 
well as identifying appropriate maintenance activities 
and scheduling to ensure reliability optimization. 